Drawing Inference

Lucy D’Agostino McGowan

Data

data_sample 
# A tibble: 29 × 5
      id screen_time battery_percent campus_location pit_meals
   <int>       <dbl>           <dbl> <chr>               <dbl>
 1     9         493              77 Quad                    2
 2     9         229              36 Quad                    5
 3     9         281              11 Quad                    4
 4     9         176              41 Quad                    2
 5     9         303              75 Quad                    2
 6     9         588              66 North Campus            9
 7     9         205              53 Quad                    4
 8     9         279              56 Quad                    3
 9     9         387              45 Quad                    3
10     9         278              33 Quad                    5
# ℹ 19 more rows

Survey data

Code
ggplot(data_sample, aes(x = screen_time, y = battery_percent)) +
  geom_point() + 
  labs(x = "Average Daily Screen Time (Minutes)",
       y = "Battery Percent")

Full Survey data

Code
ggplot(data, aes(x = screen_time)) + 
  geom_histogram(bins = 50)
ggplot(data, aes(x = battery_percent)) + 
  geom_histogram(bins = 50)

Data Cleaning

data_clean <- data |>
  filter(battery_percent <= 100) |> # drop rows with impossible battery percentages
  mutate(screen_time = ifelse(screen_time == 1210, 132, screen_time)) # correct an implausible screen-time entry

Survey data

Code
ggplot(data_clean, aes(x = screen_time, y = battery_percent)) +
  geom_point() + 
  geom_point(data = data_sample, color = "cornflowerblue") + 
  labs(x = "Average Daily Screen Time (Minutes)",
       y = "Battery Percent")

Survey data

Code
ggplot(data_clean, aes(x = screen_time, y = battery_percent)) +
  geom_point() + 
  geom_smooth(method = "lm", se = FALSE, formula = "y ~ x", lty = 2, color = "orange") +
  geom_point(data = data_sample, color = "cornflowerblue") + 
  geom_smooth(data = data_sample, method = "lm", se = FALSE, 
              formula = "y ~ x", color = "cornflowerblue") + 
  labs(x = "Average Daily Screen Time (Minutes)",
       y = "Battery Percent",
       caption = "ID: 9")

Survey data

Code
ggplot(data_clean, aes(x = screen_time, y = battery_percent)) +
  geom_point() + 
  geom_smooth(method = "lm", se = FALSE, formula = "y ~ x", lty = 2, color = "orange") +
  geom_point(data = data[data$id == 43,], color = "cornflowerblue") + 
  geom_smooth(data = data[data$id == 43,], method = "lm", se = FALSE, 
              formula = "y ~ x", color = "cornflowerblue") + 
  labs(x = "Average Daily Screen Time (Minutes)",
       y = "Battery Percent",
       caption = "ID: 43")

Survey data

Code
ggplot(data_clean, aes(x = screen_time, y = battery_percent)) +
  geom_point() + 
  geom_smooth(method = "lm", se = FALSE, formula = "y ~ x", lty = 2, color = "orange") +
  geom_point(data = data[data$id == 39,], color = "cornflowerblue") + 
  geom_smooth(data = data[data$id == 39,], method = "lm", se = FALSE, 
              formula = "y ~ x", color = "cornflowerblue") + 
  labs(x = "Average Daily Screen Time (Minutes)",
       y = "Battery Percent",
       caption = "ID: 39")

Survey data

What if I want to know the relationship between screen time and battery percent for Wake Forest students?


How can we quantify how much we’d expect the slope to differ from one random sample to another?

  • We need a measure of uncertainty
  • How about the standard error of the slope?
  • The standard error is how much we expect \(\hat{\beta}_1\) to vary from one random sample to another.
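To see what the standard error measures, here is a small simulation sketch. This uses synthetic data, not the class survey: we draw many samples of 29 from a population whose true slope is zero, refit the line each time, and look at how much the fitted slope varies from sample to sample.

```r
set.seed(1)
# simulate 1000 samples of n = 29 from a population with a true slope of 0;
# the noise level (sd = 24) is chosen to roughly match the survey data
slopes <- replicate(1000, {
  x <- runif(29, 0, 600)               # screen time, in minutes
  y <- 60 + 0 * x + rnorm(29, sd = 24) # battery percent under the null
  coef(lm(y ~ x))[["x"]]
})
sd(slopes) # the spread of the slopes across samples estimates the standard error
```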

Survey data

How can we quantify how much we’d expect the slope to differ from one random sample to another?

mod <- lm(battery_percent ~ screen_time, data = data_sample)
summary(mod)

Call:
lm(formula = battery_percent ~ screen_time, data = data_sample)

Residuals:
    Min      1Q  Median      3Q     Max 
-47.346 -15.632   1.009  15.228  43.459 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 59.13539   13.51816   4.375 0.000163 ***
screen_time -0.01422    0.04120  -0.345 0.732609    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 23.75 on 27 degrees of freedom
Multiple R-squared:  0.004394,  Adjusted R-squared:  -0.03248 
F-statistic: 0.1192 on 1 and 27 DF,  p-value: 0.7326

Survey data

We need a test statistic that incorporates \(\hat{\beta}_1\) and the standard error

mod <- lm(battery_percent ~ screen_time, data = data_sample)
summary(mod)

Call:
lm(formula = battery_percent ~ screen_time, data = data_sample)

Residuals:
    Min      1Q  Median      3Q     Max 
-47.346 -15.632   1.009  15.228  43.459 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 59.13539   13.51816   4.375 0.000163 ***
screen_time -0.01422    0.04120  -0.345 0.732609    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 23.75 on 27 degrees of freedom
Multiple R-squared:  0.004394,  Adjusted R-squared:  -0.03248 
F-statistic: 0.1192 on 1 and 27 DF,  p-value: 0.7326
  • \(t = \frac{\hat{\beta}_1}{SE_{\hat{\beta}_1}}\)
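Plugging the estimate and standard error from the output above into this formula:

```r
# values copied from the summary() output above
beta_hat <- -0.01422
se_beta  <-  0.04120
t_stat <- beta_hat / se_beta
t_stat
# about -0.345, matching the t value reported for screen_time
```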

Survey data

How do we interpret this?

mod <- lm(battery_percent ~ screen_time, data = data_sample)
summary(mod)

Call:
lm(formula = battery_percent ~ screen_time, data = data_sample)

Residuals:
    Min      1Q  Median      3Q     Max 
-47.346 -15.632   1.009  15.228  43.459 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 59.13539   13.51816   4.375 0.000163 ***
screen_time -0.01422    0.04120  -0.345 0.732609    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 23.75 on 27 degrees of freedom
Multiple R-squared:  0.004394,  Adjusted R-squared:  -0.03248 
F-statistic: 0.1192 on 1 and 27 DF,  p-value: 0.7326
  • \(\hat{\beta}_1\) is about 0.35 standard errors below a slope of zero

Survey data

How do we know what values of this statistic are worth paying attention to?

mod <- lm(battery_percent ~ screen_time, data = data_sample)
summary(mod)

Call:
lm(formula = battery_percent ~ screen_time, data = data_sample)

Residuals:
    Min      1Q  Median      3Q     Max 
-47.346 -15.632   1.009  15.228  43.459 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 59.13539   13.51816   4.375 0.000163 ***
screen_time -0.01422    0.04120  -0.345 0.732609    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 23.75 on 27 degrees of freedom
Multiple R-squared:  0.004394,  Adjusted R-squared:  -0.03248 
F-statistic: 0.1192 on 1 and 27 DF,  p-value: 0.7326
  • confidence intervals, p-values
  • Hypothesis testing: \(H_0: \beta_1 = 0\) \(H_A: \beta_1 \neq 0\)

Survey data

How do we get a confidence interval for \(\hat{\beta}_1\)? What function can we use in R?


confint(mod)
                  2.5 %      97.5 %
(Intercept) 31.39842927 86.87235645
screen_time -0.09876341  0.07031636

How do we interpret this value?
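Under the hood, confint() takes the estimate plus or minus a t critical value times the standard error. With the numbers from the summary output (29 observations, so 27 degrees of freedom):

```r
# values copied from the summary() output above
beta_hat <- -0.01422
se_beta  <-  0.04120
ci <- beta_hat + c(-1, 1) * qt(0.975, df = 27) * se_beta
ci
# approximately (-0.0988, 0.0703), matching confint(mod)
```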

Application Exercise

  1. Open appex-07.qmd
  2. Fit the model of battery_percent and screen_time in your data
  3. Calculate a confidence interval for the estimate \(\hat\beta_1\)
  4. Interpret this value

Hypothesis testing

  • So far, we have estimated the relationship between screen time and battery percent
  • This could be useful if we wanted to understand, on average, how these variables are related (estimation)
  • This could also be useful if we wanted to predict a student’s battery percent based on their screen time (prediction)
  • What if we just want to know whether there is some relationship between the two? (hypothesis testing)

Hypothesis testing

  • Null hypothesis: There is no relationship between screen time and battery percent
    • \(H_0: \beta_1 = 0\)
  • Alternative hypothesis: There is a relationship between screen time and battery percent
    • \(H_A: \beta_1 \neq 0\)

Hypothesis testing

summary(mod)

Call:
lm(formula = battery_percent ~ screen_time, data = data_sample)

Residuals:
    Min      1Q  Median      3Q     Max 
-47.346 -15.632   1.009  15.228  43.459 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 59.13539   13.51816   4.375 0.000163 ***
screen_time -0.01422    0.04120  -0.345 0.732609    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 23.75 on 27 degrees of freedom
Multiple R-squared:  0.004394,  Adjusted R-squared:  -0.03248 
F-statistic: 0.1192 on 1 and 27 DF,  p-value: 0.7326

Is \(\hat\beta_1\) different from 0?

Hypothesis testing

summary(mod)

Call:
lm(formula = battery_percent ~ screen_time, data = data_sample)

Residuals:
    Min      1Q  Median      3Q     Max 
-47.346 -15.632   1.009  15.228  43.459 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 59.13539   13.51816   4.375 0.000163 ***
screen_time -0.01422    0.04120  -0.345 0.732609    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 23.75 on 27 degrees of freedom
Multiple R-squared:  0.004394,  Adjusted R-squared:  -0.03248 
F-statistic: 0.1192 on 1 and 27 DF,  p-value: 0.7326

Is \(\beta_1\) different from 0? (notice the lack of the hat!)

p-value

The probability of observing a test statistic as extreme as or more extreme than the one observed, given that the null hypothesis is true

p-value

summary(mod)

Call:
lm(formula = battery_percent ~ screen_time, data = data_sample)

Residuals:
    Min      1Q  Median      3Q     Max 
-47.346 -15.632   1.009  15.228  43.459 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 59.13539   13.51816   4.375 0.000163 ***
screen_time -0.01422    0.04120  -0.345 0.732609    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 23.75 on 27 degrees of freedom
Multiple R-squared:  0.004394,  Adjusted R-squared:  -0.03248 
F-statistic: 0.1192 on 1 and 27 DF,  p-value: 0.7326

What is the p-value? What is the interpretation?
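The p-value in the output can be recomputed from the t statistic and a t distribution with 27 degrees of freedom; the test is two-sided, so we double the tail probability:

```r
t_stat <- -0.345 # t value from the summary() output above
p_value <- 2 * pt(abs(t_stat), df = 27, lower.tail = FALSE)
p_value
# about 0.733, matching the reported 0.7326 up to rounding of t
```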

Hypothesis testing

  • Null hypothesis: \(\beta_1 = 0\) (there is no relationship between screen time and battery percent)
  • Alternative hypothesis: \(\beta_1 \neq 0\) (there is a relationship between screen time and battery percent)
  • Often we have an \(\alpha\) level cutoff to compare the p-value to, for example 0.05.
  • If p-value < 0.05, we reject the null hypothesis
  • If p-value > 0.05, we fail to reject the null hypothesis
  • Why don’t we ever “accept” the null hypothesis?
  • Absence of evidence is not evidence of absence
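The decision rule above, as a minimal sketch using the p-value from our model:

```r
p_value <- 0.7326 # from summary(mod)
alpha <- 0.05     # our chosen significance level
decision <- if (p_value < alpha) "reject H0" else "fail to reject H0"
decision
# "fail to reject H0" -- note this is not evidence that the slope is zero
```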

p-value

summary(mod)

Call:
lm(formula = battery_percent ~ screen_time, data = data_sample)

Residuals:
    Min      1Q  Median      3Q     Max 
-47.346 -15.632   1.009  15.228  43.459 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 59.13539   13.51816   4.375 0.000163 ***
screen_time -0.01422    0.04120  -0.345 0.732609    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 23.75 on 27 degrees of freedom
Multiple R-squared:  0.004394,  Adjusted R-squared:  -0.03248 
F-statistic: 0.1192 on 1 and 27 DF,  p-value: 0.7326

Do we reject the null hypothesis?

Application Exercise

  1. Open appex-07.qmd
  2. Examine the summary of the model of battery_percent and screen_time with your data
  3. Test the null hypothesis that there is no relationship between screen time and battery percent
  4. What is the p-value? What is the result of your hypothesis test?
  5. Turn this in on Canvas

Survey Data

Code
data_clean |>
  nest_by(id) |> # fit a separate model to each student's data
  mutate(model = list(lm(battery_percent ~ screen_time, data = data))) |>
  reframe(broom::tidy(model, conf.int = TRUE)) |>
  filter(term == "screen_time") |>
  # flag whether each interval covers the dashed reference slope
  mutate(covers = ifelse(conf.low < -0.03608045 & conf.high > -0.03608045, 1, 0)) |>
  ggplot(aes(y = factor(id), xmin = conf.low, x = estimate, xmax = conf.high, color = covers)) +
  geom_pointrange() +
  geom_vline(xintercept = -0.03608045, lty = 2) + 
  theme(legend.position = "none") + 
  ylab("id") + 
  xlab("Estimated Slope (Battery Percent per Minute of Screen Time)")